Information Gain Versus Gain Ratio: A Study of Split Method Biases

Author

  • Earl Harris

Abstract

One corollary of Cullen Schaffer's Conservation Law of Generalization Performance indicates that no learner is generally better than another learner. If the first learner performs better than the second learner in some learning situations, the first learner must perform worse than the second learner in other learning situations. Unfortunately, the corollary does not describe the circumstances in which a specific learner has an advantage. This article focuses on two decision tree learners. One uses the information gain split method and the other uses gain ratio. It presents a predictive method that helps to characterize problems where information gain performs better than gain ratio (and vice versa). To support the practical relevance of this research, it shows that the predictive method works effectively on the contraceptive method choice problem from the Cal-Irvine Machine Learning Repository. This article offers new insight into how these two split methods affect a decision tree learner's bias. © 2001 The MITRE Corporation. All Rights Reserved.
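The abstract names the two split methods without defining them. As a minimal sketch, the block below assumes the definitions usually attributed to Quinlan's ID3 and C4.5: information gain is the reduction in class entropy after partitioning on an attribute, and gain ratio divides that gain by the entropy of the attribute's own value distribution (the split information). These may differ in detail from the article's formulation. The function names and the four-row toy data set are hypothetical and serve only as illustration.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of a sequence of discrete items."""
    total = len(items)
    return -sum((n / total) * log2(n / total) for n in Counter(items).values())


def information_gain(values, labels):
    """Class entropy minus the weighted entropy of each attribute-value partition."""
    total = len(labels)
    partitions = {}
    for value, label in zip(values, labels):
        partitions.setdefault(value, []).append(label)
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    return entropy(labels) - remainder


def gain_ratio(values, labels):
    """Information gain normalized by split information (entropy of the values)."""
    split_info = entropy(values)
    if split_info == 0:
        # A single-valued attribute cannot split the data (assumed convention).
        return 0.0
    return information_gain(values, labels) / split_info


# Hypothetical toy data: both attributes separate the classes perfectly,
# but record_id does so with one branch per example.
labels = ["yes", "yes", "no", "no"]
record_id = [1, 2, 3, 4]
binary_attr = ["a", "a", "b", "b"]

for name, attr in [("record_id", record_id), ("binary_attr", binary_attr)]:
    print(name, information_gain(attr, labels), gain_ratio(attr, labels))
```

On this toy data, information gain cannot distinguish the two attributes, while gain ratio penalizes the four-valued `record_id` for its larger split information. This is one simple way in which the two methods can impose different biases on a decision tree learner; it is not the predictive method the article itself proposes.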


Related resources

Diagnosis of the disease using an ant colony gene selection method based on information gain ratio using fuzzy rough sets

With the advancement of metagenome data mining, science has become focused on microarrays. Microarrays are datasets with a large number of genes that are usually irrelevant to the output class; hence, the process of gene selection or feature selection is essential. It follows that redundant genes can be removed to increase the speed and accuracy of classification. After applying the gene se...


The relationship between pregnancy weight gain and impaired glucose tolerance test

Impaired glucose tolerance has several adverse effects on the growing fetus. In this study, we evaluated the effect of excessive weight gain during pregnancy on the risk of glucose intolerance in pregnant women. A case-control study was conducted in which the glucose tolerance status after 100 gram oral glucose intake was compared between 60 pregnant women with maximum 10 weeks of gestation and...


Fuzzy-rough Information Gain Ratio Approach to Filter-wrapper Feature Selection

Feature selection for various applications has been carried out for many years in many different research areas. However, there is a trade-off between finding feature subsets with minimum length and increasing the classification accuracy. In this paper, a filter-wrapper feature selection approach based on fuzzy-rough gain ratio is proposed to tackle this problem. As a search strategy, a modifie...


Prediction of Gain in LD-CELP Using Hybrid Genetic/PSO-Neural Models

In this paper, the gain in the LD-CELP speech coding algorithm is predicted using three neural models that are equipped with genetic and particle swarm optimization (PSO) algorithms to optimize the structure and parameters of the neural networks. Elman, multi-layer perceptron (MLP) and fuzzy ARTMAP are the candidate neural models. The optimized number of nodes in the first and second hidden layers of El...



Publication date: 2002